Evaluation of Automatic Speaker Recognition Approaches
Authors
Abstract
This paper deals with automatic speaker recognition in Czech. We focus on context-independent speaker recognition with a closed set of speakers. To the best of our knowledge, there is no comparative study of different speaker recognition approaches for the Czech language. The main goal of this paper is thus to evaluate and compare several parametrization/classification methods in order to build an efficient Czech speaker recognition system. All experiments are performed on a Czech speaker corpus that contains approximately half an hour of speech from ten native Czech speakers. Four parametrizations, which other studies report as particularly successful for the speaker recognition task, are compared: Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction Coefficients (PLPC), Linear Prediction Reflection Coefficients (LPREFC) and Linear Prediction Cepstral Coefficients (LPCEPSTRA). Two classifiers are compared: Hidden Markov Models (HMMs) and the Multi-Layer Perceptron (MLP). We further study how the size of the training corpus and the length of the test sentence affect recognition accuracy for the different parametrizations and classifiers. For instance, we found experimentally that recognition is still very accurate for test utterances as short as two seconds. The best recognition accuracy is obtained with the LPCEPSTRA/LPREFC parametrizations and the HMM classifier.
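The closed-set decision described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: a single diagonal Gaussian per speaker stands in for the HMM and MLP classifiers, and synthetic 4-dimensional vectors stand in for MFCC or LPCEPSTRA frames. The names `fit_diag_gaussian`, `log_likelihood`, `identify`, `synth` and the speaker labels are hypothetical and chosen only for illustration.

```python
import math
import random


def fit_diag_gaussian(frames):
    """Estimate a per-dimension mean and variance from a list of feature frames."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)  # variance floor
        for d in range(dim)
    ]
    return means, variances


def log_likelihood(frames, model):
    """Sum the diagonal-Gaussian log-likelihood over all frames of an utterance."""
    means, variances = model
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, means, variances):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def identify(frames, models):
    """Closed set: always answer with the best-scoring enrolled speaker."""
    return max(models, key=lambda spk: log_likelihood(frames, models[spk]))


# Synthetic stand-in for real feature frames (assumption: no audio is processed here).
rng = random.Random(0)


def synth(center, n_frames, dim=4):
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n_frames)]


train = {
    "spk_a": synth(-3.0, 200),
    "spk_b": synth(0.0, 200),
    "spk_c": synth(3.0, 200),
}
models = {spk: fit_diag_gaussian(frames) for spk, frames in train.items()}

# A short test utterance: fewer frames means less evidence for the decision.
test_utterance = synth(3.0, 20)
predicted = identify(test_utterance, models)
print(predicted)
```

Because the score is a sum over frames, shortening the test utterance reduces the evidence available to separate speakers, which is the kind of effect the abstract's test-sentence length experiments measure.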
Similar resources
Performance evaluation of Statistical Approaches for Automatic Text-Independent Speaker Recognition using Robust Features
This paper presents a performance evaluation of statistical approaches for an automatic text-independent speaker recognition system. The goal of such a system is to quickly and accurately identify a person from his/her voice. The study on the effect of feature vector size for good speaker recognition demonstrates that the feature vector size in the range of 18-...
Automated Speaker Recognition Methods: a Critical Review
In this paper, an overview of state-of-the-art approaches for speaker recognition is presented. Due to the increased scale of dialogue system applications, interest in the field has grown rapidly in recent years. Nevertheless, there are many open issues in the field of automatic speaker recognition. The techniques, evaluations, and implementations of various proposed speaker recogn...
Methodologies for the evaluation of speaker diarization and automatic speech recognition in the presence of overlapping speech
Speaker Diarization and Automatic Speech Recognition have been topics of research for decades. Evaluating the developed systems has been required for almost as long. Following the NIST initiatives, a number of metrics have become standard for these evaluations, namely the Diarization Error Rate and the Word Error Rate. The initial definitions of these metrics and, more importantly, their ...
Human Assisted Speaker Recognition In NIST SRE10
The NIST series of Speaker Recognition Evaluations (SRE’s) have, since 1996, evaluated automatic systems for speaker recognition. The 2010 evaluation (SRE10) also included a test of Human Assisted Speaker Recognition (HASR), in which systems based, in whole or in part, on human expertise were evaluated. Participants were invited to complete the trials in one of two small subsets of the full set...
Performance Evaluation of Automatic Speaker Recognition Techniques for Forensic Applications
Speaker recognition is a biometric technique employed in many different contexts, with various degrees of success. One of the most controversial uses of automatic speaker recognition systems is their employment in the forensic context [1, 2], in which the goal is to analyze speech data coming from wiretappings or ambient recordings retrieved during criminal investigation, with the purpose of r...